Robust muscle force prediction using NMFSEMD denoising and FOS identification

In this paper, an aliasing noise restraint technique and a system identification-based surface electromyography (sEMG)-force prediction model are proposed to realize a type of robust sEMG and muscle force prediction. For signal denoising, a novel non-negative matrix factorization screening empirical mode decomposition (NMFSEMD) and a fast orthogonal search (FOS)-based muscle force prediction model are developed. First, the NMFSEMD model is used to screen the empirical mode decomposition (EMD) results into the noisy intrinsic mode functions (IMF). Then, the noise matrix is computed using IMF translation and superposition, and the matrix is used as the input of NMF to obtain the denoised IMF. Furthermore, the reconstruction outcome of the NMFSEMD method can be used to estimate the denoised sEMG. Finally, a new sEMG muscle force prediction model, which considers a kind of candidate function in derivative form, is constructed, and a data-training-based linear weighted model is obtained. Extensive experimental results validate the suggested method’s correction: after the NMFSEMD denoising of raw sEMG signal, the signal-noise ratio (SNR) can be improved by about 15.0 dB, and the energy percentage (EP) can be greater than 90.0%. Comparing with the muscle force prediction models using the traditional pretreatment and LSSVM, and the NMFSEMD plus LSSVM-based method, the mean square error (MSE) of our approach can be reduced by at least 1.2%.


Introduction
When the astronauts work in the microgravity environment of space stations for an extended period, they experience bone loss and muscle atrophy [1][2][3], resulting in a substantial loss of muscle strength, which impacts the astronaut's physical health and the ability to perform tasks, such as handling equipment and opening cabin doors. After returning to Earth, astronauts typically need a period of rehabilitation training to regain their physical health. Direct and indirect measurement methods are commonly used to assess muscle strength. approach causes some harms to the human body and is not portable when astronauts are active, so it has limited application. The latter method evaluates muscle force by analyzing the surface electromyogram (sEMG) signals of an astronaut. Clearly, this technique is an efficient and noninvasive technique with a broad application. Research on muscle force prediction is useful for monitoring the muscle force status of astronauts in real-time and developing targeted on-orbit rehabilitation training programs.
The muscle force prediction model, including the physiological and the phenomenal models, establishes a nonlinear relationship between the sEMG signals and muscle force. A typical physiological model, the Hill model, is based on various physiological parameters during muscle contraction [4]. Its simplified model has good adaptability and low computational amount because it ignores muscle fatigue; however, its accuracy is poor and can only predict one muscle. Researchers have committed to finding an optimal strategy with low complexity and high accuracy by using sensitivity analysis [5,6]. The phenomenal model includes polynomial fitting, artificial neural networks, and system identification. When using polynomial fitting, some authors predicted the joint torque of biceps brachii [7]; other authors found that muscle force prediction was the best when the polynomial fitting order was more than three [8]. The polynomial fitting has a simple structure and shorter training time; however, it has low accuracy. Thus, some researchers adopted a neural network to predict hand and palm-grip forces during elbow motion [9,10]. The accuracy of an artificial neural network depends on its structure and parameters, and the training typically takes a long time.
Another phenomenal model used for muscle-force prediction is the system identification method. Popular methods are parallel cascade identification (PCI) and fast orthogonal search (FOS). Some authors used PCI to build a relationship between the upper arm sEMG signals and wrist induction force [11]. PCI considered the dynamic linear finite impulse response and static nonlinear fitting, which realized the fusion of dynamic and static information. However, it took a long training time. Conversely, the FOS method established prediction models by minimizing the mean square error (MSE) between the estimated and true values of the system output [12]. Some authors used FOS to predict the muscle force of upper arm and forearm muscles during elbow flexion and extension [13]. Thus, FOS quickly achieved muscle force prediction and resulted in high accuracy by building suitable candidate functions.
Surface EMG signals have non-linear and non-stationary characteristics, and are usually processed by time-frequency analysis. However, traditional analysis is based on Fourier analysis, which cannot express local frequency change with time. Empirical mode decomposition (EMD) overcomes this limitation and is widely used in physiological signal analysis, fault diagnosis, and other fields. EMD has problems such as modal aliasing and end effect [14]. Scholars have proposed improved EMD methods to suppress modal aliasing. For example, research on improvement based on noise assistance, ensemble empirical mode decomposition (EEMD), and noise assisted multivariate empirical mode decomposition (NAMEMD) were realized by adding white noise to EMD decomposition [15][16][17]. Research on improvement based on mathematics, including local fitting curve replaced cubic interpolation curve [18]. Research on improvement based on EMD decomposition characteristics, independent component analysis (ICA) was used to eliminate mode aliasing in EMD decomposition results [19].
In this study, we propose a non-negative matrix factorization screening empirical mode (NMFSEMD) denoising approach for removing the aliasing noise in the sEMG signals by analyzing the noise distribution of sEMG at various time characteristic scales. Using the denoised sEMG signal as the input of muscle force prediction model, the reliability and accuracy of the model can be improved. Additionally, we apply the system identification model FOS to establish the prediction model. We preset several forms of candidate functions according to the time-domain characteristics of sEMG signals, then employ an iterative optimization technique to select several basis functions, and use their weighted sums as the muscle force prediction model.
The major contributions of this investigation are as follows: a novel NMFSEMD method for denoising sEMG signals is proposed and a candidate function in the form of a derivative to the FOS model to predict the muscle force is considered. This noise reduction technique has three main advantages: (i) as an improved algorithm of EMD, NMFSEMD can better suppress mode mixture and improve the data reliability of sEMG in muscle force prediction; (ii) NMFSEMD can distinguish the noisy intrinsic mode functions (IMF) components and the real intrinsic mode functions components (IMFs) from the EMD decomposition results by using the designed IMFs screening process; (iii) NMFSEMD is a data fusion method essentially and can extract several sequences that can best represent the state of muscle activation when used in sEMG signals.
The remainder of this paper is organized as follows. First, we will introduce the principle of proposed NMFSEMD. Second, we present the process of building a muscle force prediction model using FOS. Finally, experimental findings and discussions will be presented.

Proposed sEMG-force-prediction method
In the muscle force prediction process, the sEMG signal preprocessing is first considered to improve the reliability of modeling data. Conventional preprocessing methods are initially used for sEMG signal, such as outlier detection. Then, the denoising and preprocessing of sEMG signals are conducted by integrating the EMD with NMF, i.e., the proposed NMFSEMD denoising method. Subsequently, the denoised sEMG signals are obtained and used as the input of the muscle force prediction model. Then, basic functions of FOS are computed using the denoised sEMG signals in the form of preseted candidate functions. Finally, the muscle force prediction model represented by the linear combination of several basic functions is obtained and can be calculated through FOS model training by comparing the measured and predictive muscle forces in the model training process. The flowchart of sEMG-based muscle force prediction is shown in Fig 1. 2.1 sEMG signal denoising using EMD and NMF 2.1.1 EMD method. While acquiring the sEMG signals, the power line interference, ambient noise, motion artifact interference [20][21][22], and other noises can be observed. The frequency range of sEMG signals is 0.0 to 500.0 Hz [23], and the main frequency range of sEMG signals is 50.0 to 150.0 Hz. The frequencies of motion artifact interference, ambient noise, and power noise are from 0 to 20.0 Hz, 40.0 to 60.0 Hz, and 50.0 Hz, respectively. The traditional noise reduction methods used for analyzing the sEMG signals are bandpass filtering, wavelet filtering, EMD, or blind source separation (BSS) [21,[24][25][26]. However, the bandpass filtering method cannot filter out the noise mixed in the main spectrum. Moreover, it is different to use the wavelet filtering method to accurately evaluate the denoising results. The BSS method has many strict restrictions and requirements. For example, the source signals obey Gaussian distribution and their intensities should be less than 1.0; each source signal is statistically independent. According to its characteristic time scale, the EMD approach can adaptively decompose the signal into a series of IMF with frequency values ranging from high to low. The characteristic time scale denotes the signal change process that can be directly observed from the timedomain and represent the local frequency features. Thus, the noise in sEMG signals can be removed by processing intrinsic mode functions components (IMFs) which containing different frequency information.
The theory of EMD is based on the idea that any real signals can be decomposed into a set of simple IMFs, which are independent of each other [27]. EMD decomposes the fluctuations of different scales in a signal step by step and produces a series of data sequences of different scales. Each sequence is called an IMF. Each IMF should satisfy two conditions [28][29][30], namely, the difference between the number of local extreme points and zero points is not more than one; and at any time, the average value of upper envelope formed by the local maximum point and the lower envelope formed by the local minimum point is zero. The calculation process of EMD is as follows: 1. The upper and lower envelopes are obtained by cubic spline interpolation of the local maxima and local minima points of signal x(t), respectively.

The difference h(t) between
x(t) and the average value m(t) of the upper and lower envelopes is calculated.
3. If h(t) meets both the IMF conditions, it is the first IMF; otherwise, it is the new x(t). The preceding steps are repeated until the difference after k times satisfies the IMF conditions and is recorded c(t) = h k (t) as the first IMF.
4. The existing IMF c(t) from x(t) is removed to obtain the remaining component r(t), and r (t) is considered as the new x(t). Then the preceding steps are repeated to get the remaining (n-1) IMFs.
5. When the residual r N (t) is a monotone function, the algorithm can be stopped. The expression of signal x(t) after EMD decomposition is shown in Eq (1).
where c n (t) (n = 1, . . ., N) is the IMF and r N (t) is the residual error.

NMF method.
Some noises in the sEMG signals can be filtered out by discarding the corresponding IMFs after EMD decomposition [15]. However, the noises mixed in the main spectra of signals still remain. Thus, the BSS method is used to filter out these noises because it can recover "source signals" that cannot be directly observed from the "mixed signals" [31]. BSS methods include principal component analysis (PCA) [32], ICA [33], and NMF [34]. PCA uses the linear transformation of matrix to reduce data dimensions and eliminate redundant data. ICA and EMD decompose the matrix to find the basis vectors representing the local features of the observed matrix. ICA and EMD are more suitable than PCA for obtaining muscle activity feature from the sEMG signals. Additionally, ICA results must be statistically independent, with at least one component obeys Gaussian distribution, and the observed signal dimension should not be less than the decomposed dimension. However, the distribution and quantity of NMF results have no specific requirements, and the non-negative of the decomposed matrix makes the result more meaningful in physiological signal analysis. Thus, NMF is more suitable for denoising the sEMG signals in this work.
For a given non-negative matrix V, the basic principle of NMF can be described in [34]. Herein, we find two non-negative matrices W and H, whose product approaches V, as shown in Eq (2).
where r is the target dimension, m is the dimension of input matrix row, and n is the dimension of input matrix column. The key of NMF is the construction of loss and optimization functions. The loss function based on Euclidean distance and Kullback-Leibler divergence can meet the requirements of matrix factorization in NMF [35]. First, we initialize W and H, and then their iterative equations are computed by minimizing the optimization function. The NMF algorithm converges in a finite number of iterations.

Proposed NMFSEMD algorithm.
EMD can decompose the raw sEMG signals into a series of IMFs with different time scales; however, not every IMF contains useful information about the raw signal. To distinguish the IMFs with useful information and the IMFs with noisy information, we design their respective processes to achieve noise reduction in this section, i.e., an NMFSEMD denoising method is proposed. The implementation process of the NMFSEMD method is shown in Fig 2. The designs of NMFSEMD method are as follows: First, a set of IMFs with decaying frequencies can be obtained using the EMD. The frequency spectrum of each IMF is a part of the raw sEMG signal spectrum, whereas the total frequency distribution of IMFs is the frequency distribution of signal. For a set of IMFs, the higher the index of an IMF is, the higher its rank would be. The low-rank IMFs have a smaller time scale and a wider frequency domain, whereas the high-rank IMFs have a larger time scale and a narrow frequency domain. According to the spectrum analysis of sEMG noise presented in Section 2.1.1, the IMFs distributed in the low frequency mainly represent the low-frequency noise in the signal, and the noise can be eliminated directly by removing the corresponding IMFs. However, low-rank IMFs with a wider frequency domain overlap the main spectrum of sEMG. Thus, it is difficult to filter out these noises by removing IMFs. This unfavorable phenomenon where the noise frequency overlaps with the main frequency of signal is called the modal aliasing problem in EMD. That is, multiple time feature scales appear in one IMF, or one-time feature scale appears in multiple IMFs [26]. Ideally, each IMF only contains one time feature scale [27]. The proposed NMFSEMD method can address this modal aliasing problem by screening the IMFs, filtering out the noise in them, and finally reconstructing the denoised sEMG signal containing the main muscle activation information.
The specific steps of NMFSEMD are as follows: 1. The noisy sEMG signal is decomposed using the EMD method to obtain a set of IMFs with different time scales: 2. In the abovementioned qualitative analysis of IMFs, the IMFs indicating noise should be removed, the IMFs with modal aliasing should be denoised, and the IMFs containing the main information of signal should be directly used for signal reconstruction. To achieve this goal in quantitative analysis, the concept of "correlation" is used to identify different types of IMFs. The detailed steps are as follows: • First, the Pearson correlation value between each IMF and raw sEMG signal is calculated, as shown in Eq (3).
ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi where x is the raw sEMG signal, imf i is the i-th component of the IMF set, n is the number of sampling points, � x is the mean value of x, and � imf i is the mean value of imf i . The value of R is in the range [−1, 1]. The higher the value of R is, the stronger the correlation between x and imf i would be.
• Second, imf add , imf nmf , and imf p are searched according to the obtained correlation set: CR ¼ fRðx; imf 1 Þ; Rðx; imf 2 Þ; . . .; Rðx; imf m Þg. The symbol imf add is used to construct the denoised signal and the input matrix of NMF. The symbol imf nmf involves the phenomenon of model aliasing and should be denoised and then used to construct the denoised signal. The symbol imf p contains the main information about the signal and is directly used to construct the denoised signal. Clearly, the imf p does not exist in all IMF set. The selection strategies of three IMFs are as follows: cr th called the preseted correlation threshold, which is an empirical boundary value that determines whether the IMF treated as noise is to be discarded. The symbol cr e is called the empirical maximum correlation, which is the maximum correlation at the positions where the higher correlations often appear in set. if set ind c is not empty, then imf p is the superposition of all IMFs in set imfs c .
if set ind b has more than 1 IMF component, then imf nmf is the IMF with the smallest correlation in set imfs b , imf add is the superposition of other IMFs in the set imfs b except imf nmf .
else imf nmf is the IMF with the largest correlation in set imfs a , imf add is the IMF in set imfs b .
if set ind a and ind c are not empty, set ind b is empty: if set ind c has more than 2 IMF components, then imf p is the IMF with the largest correlation in set imfs c , imf nmf is the IMF with the smallest correlation in set imfs c , imf add is the superposition of other IMFs in set imfs c except imf p and imf nmf . else imf p is the superposition of all IMFs in set imfs c , imf nmf is the IMF with the largest correlation in set imfs a . if set ind a is not empty, set ind b and ind c are empty: imf nmf is the IMF with the largest correlation in set imfs a , imf add is the superposition of other IMFs in set imfs a except imf nmf . end • Then, the IMFs that are not selected to construct imf add , imf nmf , and imf p are removed as noises. Next, NMF is used to denoise the imf nmf .
3. Because the dimension of input matrix V in NMF algorithm is greater than 1, and the selected imf nmf is a vector. Thus, the method of "signal time sequence translation" is used to construct the input matrix V in NMF. The details of this construction are as follows: • First, this method cyclically shifts the sequence imf nmf to the left for p positions and splices the left overflow part to the right end of the sequence. This translation operation is repeated to obtain k different noise samples: s = {s 1 , s 2 , . . ., s k }.
• Second, this method superimposes imf p and imf add as imf o : imf o (t) = imf p (t) + imf add (t).
• Then, this method accumulates s and imf o (a row vector), respectively, to construct a noisy matrix V, as shown in Eq (4), whose dimension is k × n, where n is the duration of the raw sEMG signal.
The signal-to-noise ratio (SNR) of V is approximately equal to the raw sEMG signal. Thus, the matrix V retains the effective components of raw signal, and this cyclic shift operation only alters the noise shape of raw sEMG signal.
4. Next, the NMF method is used to decompose the matrix V k × n into the weight matrix W k × r and the activation coefficient matrix H r × n is defined according to Eq (2), where k is the number of channels of signal and r is the number of activation modes of signal. The i-th column of W represents the contribution of each channel to the i-th activation mode. The i-th row of H describes the trend of the i-th activation mode status changing with time, which is called the muscle activation curve. The steps that obtain the denoised imf nmf from matrix V are as follows: • The intensity of each activation mode can be measured by calculating the area surrounded by the muscle activation curve and time axis. Because the sampling frequency of the sEMG signal is high enough (i.e., 3000.0 Hz), the activation intensity can be calculated using Eq (5).
• After setting different r values, the corresponding number of muscle activation curves is obtained. In Section 3.4.2, we compare the denoising effect under different values of r and conclude that the denoising effect is best when r is equal to 2. Two muscle activation curves, h 1 and h 2 , can be obtained, and the denoised imf nmf can be expressed as: imf' nmf = (h 1 + h 2 )/2.

Finally, our method adds imf' nmf and imf
is the denoised signal of raw sEMG signal after applying the NMFSEMD denoising method, which is used in the subsequent muscle force prediction of FOS method.

sEMG-force modeling using FOS
The FOS method uses a series of linear or nonlinear basis functions and coefficients to establish a model whose output quickly approaches the real value of system to minimize the error between the estimated and actual values [12]. In the algorithm, an implicit orthogonalization method is used to transform the candidate functions into a set of orthogonal candidate functions [36,37] {p m }, and their corresponding coefficients are expressed by a m , as shown in Eq (6). The MSE of the system is defined in Eq (7).
whereŷ is the estimated system output, y is the true value of system, e is the model error, and {p m } is a set of orthogonal candidate functions to be selected. All candidate functions are traversed by calculating the MSE reduction Q m of each candidate function, and selecting the function add to model and remove from function set, which is corresponding to the largest Q m . We select a function from the remaining candidate functions and repeat the above steps until the termination conditions of FOS algorithm are met. The commonly used termination conditions include the following: the number of functions in the model reaches a preseted value; the model error is small enough, or the remaining candidate functions cannot considerably reduce the model error. When the search is stopped, the coefficients of each selected function can be calculated to complete the FOS algorithm modeling.

Experiments and results
A series of tests and evaluation experiments were performed to assess the validity and effectiveness of the proposed models and methods. All the simulation programs were written in Python (Pycharm2020) on our PC (32.0 GB RAM, 3.8 GHz Intel (R) Core (TM), and i7-10700K CPU).

Experimental data preparation
In this study, we investigated the preprocessing and modeling of muscle force prediction, and used the experimental data from [38]. The details of data are as follows. In the simulated weightlessness environment of bed rest, the sEMG signals of the biceps brachii, triceps, and brachioradialis muscles and the force signal of handgrip were collected synchronously during the push and pull processes. The maximum voluntary contraction was 100.0%; the signal acquisition frequency was 3000.0 Hz.

Evaluation of sEMG denoising method
In this section, the proposed denoising algorithm NMFSEMD is experimentally verified. Based on the detailed principle of NMFSEMD designed in Section 2.1.3, the key steps of this algorithm are analyzed visually and quantitatively. The implementation process of NMFSEMD algorithm is shown in Fig 2, every key steps in the process are verified in turn: the setting of correlation thresholds cr e and cr th , the screening of IMF components based on threshold, the construction of noisy matrix using the idea of signal time sequence translation, the decomposition of noisy matrix considering the NMF and activation mode parameters, and the reconstruction and fusion of signal. Finally, the denoised sEMG signal can be obtained. Compared with the distribution of IMFs in time-domain and frequency-domain before and after signal denoising, it is verified that NMFSEMD algorithm can suppress the modal aliasing better than the traditional EMD.
The EMD method was used to decompose the sEMG signal x(t) and obtain the signal distribution on different time scales. The decomposed signal was denoted as Q = {imf 1 , imf 2 , . . ., imf m }. Fig 3(A) shows that the EMD decomposition result of biceps brachii signal has 1400.0 sampling points. Moreover, some IMFs, which are not periodic functions with a single time scale, can be observed, that is, a phenomenon of model aliasing is produced by noise. To more intuitively analyze noise in IMFs, the fast Fourier transform (FFT) is used to obtain the spectrum diagram of each IMF, as shown in Fig 3(B). The horizontal and the vertical axes denote the frequency and amplitude, respectively.
In the spectrum, the raw sEMG signal x(t) has a wide frequency domain and a large low frequency amplitude. An increase of the rank of IMFs makes the frequency distribution gradually approach the low frequency. The frequencies of high-rank IMFs are distributed in the range of 0.0 to 50.0 Hz, and no overlap is recorded in the main frequency of sEMG signals. In addition to the discussion in Section 2.1.3, imf add , imf nmf , and imf p should be found to construct the denoised sEMG signal for clarifying the various types of IMFs. Thus, the results obtained on calculating the correlations between raw sEMG signal x(t) and its IMFs using Eq (3) are shown in Table 1.
The experimental analysis in this study is based on the sEMG signals of the biceps brachii. Results shows that the correlations of imf 7 and other IMFs located near the edge of IMFs set are lower, while those of imf 3 and other IMFs located near the middle of IMFs set are higher. According to Section 2.1.3, the selection strategy is used to determine imf add , imf nmf , and imf p . The empirical maximum correlation cr e as 0.7658 can be obtained, and the preseted correlation threshold cr th is set to be 0.35, according to Section 3.4.1. Thus, imf p is imf 3 , imf add is imf 5 , and imf nmf is imf 4 .
Then, the noise mixed in imf nmf should be filtered. First, the temporal shift of imf nmf is used to obtain noise samples s. When the order of signal duration is 10 3 and the order of magnitude of rows of s is about 10, the predicted result will have less error. Therefore, the dimension of s is set to 14 × 1400. Study in [39] has shown that when the sEMG signal duration was from 100.0 to 120.0 ms, it could exhibit some characteristics of an sEMG signal, therefore, the shift length of "signal time sequence translation" is set to 300 data points in this study. Then, by adding s and imf nmf , the input matrix V of NMF whose dimension is 14 × 1400 can be obtained. Subsequently, the matrix V is decomposed into the time-invariant activation mode matrix W and the time-varying activation coefficient matrix H using NMF. Further, it can be discovered that the activation information from signal is more reliable when the ranks of W and H are 2. The denoised imf nmf can be obtained by calculating the mean of all rows of H. Therefore, the denoised signal x'(t) can be obtained by adding imf p , imf add , and the denoised imf nmf . The SNR and EP are used to evaluate the denoising effect, as shown in Eqs (8) and (9). And the SNR and EP results of sEMG signals after using NMFSEMD are shown in Table 2.  where x represents the raw sEMG signal, and x'(t) denotes the denoised signal. A high SNR value implies that the signal has a good noise reduction effect, whereas a high value of EP means the denoised signal can retain more characteristics of raw signal. Based on the comparison of Figs 3 and 4, the frequency of denoised signal is mainly distributed in the middle frequency, and the low-frequency noise has been considerably reduced. The complexity of the time characteristic scale in IMF is reduced after the NMFSEMD denoising processing, and the spectrum curve of each IMF becomes smoother. Fig 5 shows that by comparing the sEMG signals before and after using the NMFSEMD denoising method, a smoother curve can be obtained after denoising with a better filtering effect.

Evaluation of the sEMG-force prediction model
The denoised sEMG signals obtained from the raw sEMG signals using the NMFSEMD denoising method are used as the input of muscle force prediction models. In this study, the FOS method is used to establish the muscle force prediction model. In this model, the candidate functions can be constructed using different function forms, as shown in Table 3. Because  the raw sEMG signal used in this paper is the integrated electromyography value, which is added the candidate function in the form of a derivative. In Table 3, E bi , E tr , and E br represent the denoised biceps brachii, triceps brachii, and brachioradialis signals, respectively. The expressions sigm and _ x represent the sigmoid function and first-order derivation of signal x, respectively.
The denoised signals are used to construct candidate functions. However, the sEMG signal is weak, complex, and changeable, the waveform of candidate function varies considerably. Therefore, the predicted force outcome usually has a large oscillation. To better observe the prediction results, the direct predicted force is interpolated, and the average value of the upper and lower envelopes is taken to resist the fluctuation. Fig 6 shows the comparison curve between the real muscle forces and estimated results of FOS models.
Many experimental results verified the effect of applying the NMFSEMD method in the muscle force prediction model and the prediction effect of FOS prediction model. Table 4 Fig 5. sEMG denoising results. The left side is the synchronous three channels raw sEMG signal, and the right side is the corresponding denoised sEMG signal using the proposed NMFSEMD. Three channels represent (a) biceps brachii signal, (b) triceps brachii signal, and (c) brachioradialis signal.
https://doi.org/10.1371/journal.pone.0272118.g005 Table 3. Set of candidate functions of FOS muscle force prediction model, which comprises the common, quadratic, square-root, sigmoid, and gradient function sets. shows when the FOS is used as the prediction model, the prediction results of input data that are not processed using the NMFSEMD method are compared with the input data that are processed using the NMFSEMD method. Additionally, Table 4 shows when the NMFSEMD denoising method is used as the preprocessing method, the results of prediction model that using the FOS method are compared with the model that using the LSSVM method. The prediction result is evaluated using the MSE value, as shown in Eq (7).  Table 4; (b) presents the results of experiment data2 in Table 4. https://doi.org/10.1371/journal.pone.0272118.g006

Discussion on the influence of EMS parameters.
According to Section 2.1.3, before screening the IMFs using NMFSEMD, the upper and lower bounds of the correlation should be determined by comparing the preseted correlation threshold cr th and the empirical maximum correlation cr e . The cr th represents the boundary between the IMFs that should be discarded and the IMFs that should be denoised, and it has a small value which is usually less than 0.5. The cr e represents the boundary between the IMFs, which should be denoised, and the IMFs, which can be directly used to construct the denoised sEMG signal, and it has a big value which is usually bigger than 0.5. To determine the values of both these parameters, we counted the correlation of 30 sets of IMFs, as shown in Figs 7 and 8.
After the EMD decomposition of raw sEMG signal, a set of IMFs with position indexes at {1, 2, 3, . . .} can be obtained. The correlation distribution of each index in a set of IMFs is shown in Fig 7. The correlation at the 1st index is less than 0.2 while those at the 2nd, 6th, 7th, and 8th indexes are distributed in a range of 0.0 to 0.4. Moreover, the correlation at the 3rd, 4th, and 5th indexes are distributed at a higher value in a range of 0.3 to 0.7. Therefore, we set the maximum of the correlation at indexes of 3rd, 4th, and 5th as the cr e , which limited the upper bound to a big value and ensured that the IMF whose correlation was greater than this value had a high correlation with the raw signal and contained most of the useful information about the raw signal. In the 30-group experiments, cr e can be written as cr e = 0.6125±0.0744. Fig 8 shows that the correlation distribution of IMFs in each experiment is more evenly distributed between 0.0 and 0.8. Therefore, in this study, the cr th , which can distinguish between the IMFs with small correlation (IMFs that need discarding) and the IMFs with medium correlation (IMFs that need denoising), is set in the range of 0.3 to 0.4.

Discussion on the influence of NMF parameters
In the abovementioned discussion, before using NMF to separate noise from sEMG signal, the rank r of matrix W and H should be preseted; where r also represents the number of muscle activation modes in a sEMG signal. Therefore, the performance of NMF method with different numbers of r can be discussed; and it also can be evaluated through the reconstruction rate (RSR) parameter, which is proposed in [40] and can be calculated using Eq (10). The larger the RSR value is, the more reasonable the setting of the number of activation modes and the more reliable the activation information obtained from H would be.
where, V k × n is the input matrix of NMF, and W k × r and H r × n are the output matrices of NMF.
When setting different numbers of activation modes, different matrices of W and H are obtained. Table 5 shows the RSR results of 6 groups of sEMG data with different numbers of r. Let us take Data 1 as an example. When r is 2, the RSR of V is 94.37%, which is large enough to represent muscle activity. When r is increased to 3, the RSR is 94.53%, which is an increase of < 0.5%. Therefore, continuing to increase the rank of matrix will not only bring more useful information but increase the noise-related information and subsequent amount of calculation, as shown in other groups of data in Table 5. Therefore, setting the number of muscle activation modes (r) to 2 in this study is suitable for sEMG signal.

Conclusions
In this study, an NMFSEMD algorithm is proposed to denoise the sEMG signals by analyzing the characteristics of non-stationarity and non-linearity of sEMG signals. First, the sEMG signal is decomposed into a set of IMFs, each of which contains different time scale characteristics. Then, the role of each IMF is distinguished in the denoised sEMG signal using the correlation between each IMF and the raw sEMG signal. Consequently, the screen operation of IMFs is performed to remove or reduce noise in each IMF. Finally, the denoised sEMG signal is obtained using the rule of reconstruction. By analyzing the results of the SNR and EP and the comparison of intuitive curves, it can be concluded that NMFSEMD can effectively filter out the noise of sEMG signals. For the muscle force prediction based on sEMG signals, the FOS algorithm in system identification is used to establish prediction model. The key point in establishing the model lies in setting of the form of candidate functions. The commonly used forms in sEMG applications are square, square-root, sigmoid activation functions, etc. Considering the input sEMG signal is the integral electromyogram value, a candidate function in the form of a derivative is added in this paper. In the experiment, when the input is the signal denoised using the NMFSEMD, the MSE value of FOS prediction model will be less than that of the LSSVM method. When the prediction model is FOS and the input data are the sEMG signals (with traditional preprocessing) and the sEMG signal denoised using the NMFSEMD, the MSE value of the latter will considerably be lower than that of the former. Comparing the predicted muscle force curve with the measured muscle strength curve, it can be concluded that the FOS prediction model uses the NMFSEMD-based denoised sEMG signal can achieve a better prediction result.  Resources: Zhiqiang Zhang, Duming Wang, Chunhui Wang.